RF: Use safe resizing for ArraySequence extension #724
In #722, the `ArraySequence` tests generate `ndarray.resize` failures similar to #719 when running `ArraySequence._resize_data_to()`. Apparently the combination of Python 3.7 and coverage produces a ghost reference.

In this case, it's not clear that the resize is being called in a safe way such that we should skip reference checks. I added a test that shows that pulling a slice for use is sufficient to create a reference that causes extension to fail. This seems to be a case where we don't want to skip the `refcheck`, as a resize that increases the array size will typically release the original memory, corrupting data that users would expect to be unchanged.

The obvious solution is to switch to `np.resize` here, but `np.resize` returns views on an inaccessible object, so they don't have the `OWNDATA` flag, which means we can no longer use `ndarray.resize` in `ArraySequence.shrink_data`. A possible remedy is using `np.resize().copy()`, which will give us back control and count on the garbage collector to take care of things, but until it does, we'll be seeing 3 copies of a single data array. This is in contrast to `ndarray.resize`, which uses `realloc` under the hood, so is pretty close to 1 copy.

Here I add a `_safe_resize` helper function. Because we can guarantee the second `ndarray.resize` will work, the intermediate `a` will be deallocated by `realloc`, not the garbage collector. This will also continue to have the current memory profile, if nobody ever tries to slice the data and then extend (which seems to be the case, as we haven't had any bug reports).

@MarcCote If you get some time, would you mind reviewing this? Or anyone else who feels comfortable.
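For reviewers, here is a minimal sketch of the approach described above, assuming the helper takes an array and a target shape. The body is an illustration and may differ in detail from the code in the diff. The `data`, `view`, and `resized` names are made up for the demonstration; the slice stands in for the "pull a slice, then extend" case covered by the new test.

```python
import numpy as np


def _safe_resize(a, shape):
    """Resize ``a`` to ``shape`` without invalidating existing views.

    Illustrative sketch only; not necessarily the exact helper in this PR.
    """
    try:
        # Try the in-place resize with the reference check left on.
        # NumPy raises ValueError if it detects other references.
        a.resize(shape)
    except ValueError:
        # Something else may be using the buffer: leave it alone and
        # work on a private copy instead.
        a = a.copy()
        # Nothing else can reference the fresh copy, so skipping the
        # check is safe and the resize can reuse its allocation.
        a.resize(shape, refcheck=False)
    return a


data = np.arange(6, dtype=np.int64)
view = data[:3]  # a slice keeps a reference to data's buffer

# Extending while the slice is alive must not touch the old memory.
resized = _safe_resize(data, (10,))
```

The caller has to rebind to the return value, since the result may be a new array rather than the one passed in. Existing slices keep pointing at the original, unchanged buffer.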